Smart Paper Notes

Multistep traffic speed prediction: A deep learning based approach using latent space mapping considering spatio-temporal dependencies

Shatrughan Modi, Jhilik Bhattacharya, Prasenjit Basak

Categories: Machine Learning | Artificial Intelligence

2021-11-03

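With the growing number of vehicles on the road, traffic management has become a major problem for cities. Intelligent Transportation Systems (ITS) can help city traffic managers address it by providing accurate traffic forecasts. This requires a reliable traffic prediction algorithm that can deliver accurate forecasts at multiple time steps from past and current traffic data. Many traffic prediction methods have been proposed in recent years and have proven effective in terms of accuracy. However, most of them consider either spatial information or temporal information alone and ignore the effect of the other. To address this, the paper develops a deep learning based approach that uses both spatial and temporal dependencies. To account for spatio-temporal dependencies, nearby road sensors at a given time instant are selected based on attributes such as traffic similarity and distance. Two pre-trained deep autoencoders are cross-connected using the concept of latent space mapping, and the resulting model is trained with traffic data from the selected nearby sensors as input. The proposed approach was trained on real-world traffic data collected from loop detector sensors installed on different highways in Los Angeles and the Bay Area. The traffic data is freely available from the web portal of the California Department of Transportation Performance Measurement System (PeMS). The effectiveness of the proposed approach is validated by comparing it with a number of machine learning and deep learning methods. The proposed approach was found to provide accurate traffic prediction results, with lower error than the other techniques, even for 60-minute-ahead prediction.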

Machine Learning Framework: Competitive Intelligence and Key Drivers Identification of Market Share Trends Among Healthcare Facilities

Anudeep Appe, Bhanu Poluparthi, Lakshmi Kasivajjula, Udai Mv, Sobha Bagadi, Punya Modi, Aditya Singh, Hemanth Gunupudi

Categories: Machine Learning

2022-12-09

The need for data-driven decisions in healthcare strategy formulation is rapidly increasing. A reliable framework that helps identify the factors affecting the market share of a healthcare provider facility or hospital (hereafter "Facility") is of key importance. This pilot study aims to develop a data-driven machine learning regression framework that helps strategists make key decisions to improve a Facility's market share, which in turn helps improve the quality of healthcare services. The US healthcare business is chosen for the study, using about 3 years of historical data spanning 60 key Facilities in Washington State. In the current analysis, market share is defined as the ratio of a Facility's encounters to the total encounters among its group of potential competitor Facilities. The study proposes a novel two-pronged approach: competitor identification, followed by regression to evaluate and predict market share. The method for identifying the pool of competitors builds Directed Acyclic Graphs (DAGs) and feature-level word vectors, and evaluates the key connected components at the Facility level. This technique is robust because it is data-driven, which minimizes the bias of empirical techniques. After the set of competitors among Facilities is identified, a regression model is developed to predict market share. To quantify the relative importance of features at the Facility level, the study incorporates SHAP, a model-agnostic explainer, which helps identify and rank the attributes that affect market share at each Facility.
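
The market share definition in the abstract above (a facility's encounters divided by the total encounters in its competitor group) can be illustrated with a minimal sketch. The function name `market_share`, the facility labels, and the encounter counts are hypothetical, and the sketch assumes that the competitor group includes the facility itself, which the abstract does not state.

```python
def market_share(encounters, competitor_pool):
    """Share of total encounters held by each facility in one competitor pool.

    encounters: dict mapping facility id -> encounter count (illustrative data).
    competitor_pool: facility ids treated as mutual competitors (assumed to
    include the facility of interest itself).
    """
    total = sum(encounters[f] for f in competitor_pool)
    if total == 0:
        raise ValueError("competitor pool has no encounters")
    return {f: encounters[f] / total for f in competitor_pool}


# Hypothetical counts; facility "D" is outside the pool and is ignored.
encounters = {"A": 500, "B": 300, "C": 200, "D": 1000}
shares = market_share(encounters, ["A", "B", "C"])
print(shares)
```

Under this reading, the shares within a pool always sum to 1, so a gain for one facility is a loss for its identified competitors.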

Is Bio-Inspired Learning Better than Backprop? Benchmarking Bio Learning vs. Backprop

Manas Gupta, Sarthak Ketanbhai Modi, Hang Zhang, Joon Hei Lee, Joo Hwee Lim

Categories: Machine Learning

2022-12-09

Bio-inspired learning has been gaining popularity recently given that Backpropagation (BP) is not considered biologically plausible. Many algorithms have been proposed in the literature which are all more biologically plausible than BP. However, apart from overcoming the biological implausibility of BP, a strong motivation for using Bio-inspired algorithms remains lacking. In this study, we undertake a holistic comparison of BP vs. multiple Bio-inspired algorithms to answer the question of whether Bio-learning offers additional benefits over BP, rather than just biological plausibility. We test Bio-algorithms under different design choices such as access to only partial training data, resource constraints in terms of the number of training epochs, sparsification of the neural network parameters and addition of noise to input samples. Through these experiments, we notably find two key advantages of Bio-algorithms over BP. Firstly, Bio-algorithms perform much better than BP when the entire training dataset is not supplied. Four of the five Bio-algorithms tested outperform BP by up to 5% accuracy when only 20% of the training dataset is available. Secondly, even when the full dataset is available, Bio-algorithms learn much more quickly and converge to a stable accuracy in far fewer training epochs than BP. Hebbian learning, specifically, is able to learn in just 5 epochs compared to around 100 epochs required by BP. These insights present practical reasons for utilising Bio-learning rather than just its biological plausibility and also point towards interesting new directions for future work on Bio-learning.

Multi-Task Learning Framework for Extracting Emotion Cause Span and Entailment in Conversations

Ashwani Bhat, Ashutosh Modi

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-11-07

Predicting emotions expressed in text is a well-studied problem in the NLP community. Recently there has been active research in extracting the cause of an emotion expressed in text. Most previous work has addressed causal emotion entailment in documents. In this work, we propose neural models to extract emotion cause span and entailment in conversations. For learning such models, we use the RECCON dataset, which is annotated with cause spans at the utterance level. In particular, we propose MuTEC, an end-to-end Multi-Task learning framework for extracting emotions, emotion cause, and entailment in conversations. This is in contrast to existing baseline models that use ground truth emotions to extract the cause. MuTEC performs better than the baselines for most of the data folds provided in the dataset.

Generalized Product-of-Experts for Learning Multimodal Representations in Noisy Environments

Abhinav Joshi, Naman Gupta, Jinang Shah, Binod Bhattarai, Ashutosh Modi, Danail Stoyanov

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-11-07

A real-world application or setting involves interaction between different modalities (e.g., video, speech, text). In order to process the multimodal information automatically and use it for an end application, Multimodal Representation Learning (MRL) has emerged as an active area of research in recent times. MRL involves learning reliable and robust representations of information from heterogeneous sources and fusing them. However, in practice, the data acquired from different sources are typically noisy. In some extreme cases, a noise of large magnitude can completely alter the semantics of the data leading to inconsistencies in the parallel multimodal data. In this paper, we propose a novel method for multimodal representation learning in a noisy environment via the generalized product of experts technique. In the proposed method, we train a separate network for each modality to assess the credibility of information coming from that modality, and subsequently, the contribution from each modality is dynamically varied while estimating the joint distribution. We evaluate our method on two challenging benchmarks from two diverse domains: multimodal 3D hand-pose estimation and multimodal surgical video segmentation. We attain state-of-the-art performance on both benchmarks. Our extensive quantitative and qualitative evaluations show the advantages of our method compared to previous approaches.
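
The idea of varying each modality's contribution to the joint distribution can be sketched for the simplest case. The sketch below assumes one-dimensional Gaussian experts and fixed, hand-picked credibility weights; in the paper the weights come from per-modality networks and the abstract does not specify the form of the experts, so the function name `gpoe_fuse` and all numbers are illustrative only.

```python
def gpoe_fuse(means, variances, weights):
    """Fuse 1-D Gaussian experts with a weighted (generalized) product of experts.

    Each expert i contributes precision weights[i] / variances[i], so a weight
    of 0 removes that modality and a larger weight pulls the joint estimate
    toward it. Returns the mean and variance of the fused Gaussian.
    """
    precisions = [w / v for w, v in zip(weights, variances)]
    total = sum(precisions)
    if total <= 0:
        raise ValueError("at least one expert needs a positive weight")
    mean = sum(p * m for p, m in zip(precisions, means)) / total
    return mean, 1.0 / total


# Two hypothetical modalities that disagree; the second is judged noisy.
print(gpoe_fuse([0.0, 4.0], [1.0, 1.0], [1.0, 1.0]))  # equal trust
print(gpoe_fuse([0.0, 4.0], [1.0, 1.0], [1.0, 0.0]))  # second modality ignored
```

With equal weights the fused mean sits midway between the two experts; setting a weight to zero recovers the remaining expert unchanged, which is the behavior wanted when one modality is corrupted by noise.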

BabyNet: A Lightweight Network for Infant Reaching Action Recognition in Unconstrained Environments to Support Future Pediatric Rehabilitation Applications

Amel Dechemi, Vikarn Bhakri, Ipsita Sahin, Arjun Modi, Julya Mestas, Pamodya Peiris, Dannya Enriquez Barrundia, Elena Kokkoni, Konstantinos Karydis

Categories: Computer Vision | Machine Learning

2022-08-09

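Action recognition is an important component for improving the autonomy of physical rehabilitation devices, such as wearable robotic exoskeletons. Existing human action recognition algorithms focus on adult applications rather than pediatric ones. In this paper, we introduce BabyNet, a lightweight (in terms of trainable parameters) network structure for recognizing infant reaching actions from off-body stationary cameras. We develop an annotated dataset that includes diverse reaches performed in a sitting posture by different infants in unconstrained environments (e.g., in home settings). Our approach uses the spatial and temporal connection of annotated bounding boxes to interpret the onset and offset of reaching and to detect a complete reaching action. We evaluate the efficiency of our proposed approach and compare its performance against other learning-based network structures in terms of the capability to capture temporal inter-dependencies and the accuracy of detecting reaching onset and offset. Results indicate that BabyNet can attain solid performance in terms of (average) testing accuracy that exceeds that of other, larger networks, and can hence serve as a lightweight, data-driven framework for video-based infant reaching action recognition.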

GesSure -- A Robust Face-Authentication enabled Dynamic Gesture Recognition GUI Application

Ankit Jha, Ishita Pratham G. Shenwai, Ayush Batra, Siddharth Kotian, Piyush Modi

Categories: Computer Vision

2022-07-22

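Using physical interactive devices such as mice and keyboards hinders naturalistic human-machine interaction and increases the probability of surface contact during a pandemic. Existing gesture recognition systems lack user authentication, which makes them unreliable. Static gestures in current gesture recognition technology introduce long adaptation periods and reduce user compatibility. Our technique places strong emphasis on user recognition and security. We use meaningful and relevant gestures for task operation, resulting in a better user experience. This paper aims to design a robust, face-verification-enabled gesture recognition system that uses a graphical user interface and focuses primarily on security through user recognition and authorization. The face model uses MTCNN and FaceNet to verify the user, while our LSTM-CNN architecture performs gesture recognition, achieving 95% accuracy on five classes of gestures. The prototype developed through our research has successfully executed context-dependent tasks such as save, print, video-player control, and exit, as well as context-free operating system tasks such as sleep, shut down, and unlock, in an intuitive manner. Our application and dataset are available as open source.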

Reconstructing the Universe with Variational self-Boosted Sampling

Chirag Modi, Yin Li, David Blei

Categories: Machine Learning (Statistics)

2022-06-28

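Forward modeling approaches in cosmology have made it possible to reconstruct the initial conditions at the beginning of the Universe from observed survey data. However, the high dimensionality of the parameter space still makes it challenging to explore the full posterior: traditional algorithms such as Hamiltonian Monte Carlo (HMC) are computationally inefficient because they generate correlated samples, and the performance of variational inference (VI) depends heavily on the choice of divergence (loss) function. Here we develop a hybrid scheme called variational self-boosted sampling (VBS) that mitigates the drawbacks of both algorithms by learning a variational approximation for the proposal distribution of Monte Carlo sampling and combining it with HMC. The variational distribution is parameterized as a normalizing flow and learned from samples generated on the fly, while proposals drawn from it reduce the auto-correlation length in MCMC chains. Our normalizing flow uses Fourier-space convolutions and element-wise operations to scale to high dimensions. We show that after a short initial warm-up and training phase, VBS generates better-quality samples than simple VI approaches and reduces the correlation length in the sampling phase by a factor of 10-50 compared with using only HMC to explore the posterior of initial conditions in $64^3$ and $128^3$ dimensional problems, with larger gains for high signal-to-noise data observations.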

On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL

Jinglin Chen, Aditya Modi, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2022-06-21

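We study reward-free reinforcement learning (RL) under general non-linear function approximation and establish sample efficiency and hardness results under various standard structural assumptions. On the positive side, we propose the RFOLIVE (Reward-Free OLIVE) algorithm for sample-efficient reward-free exploration under minimal structural assumptions, which covers the previously studied settings of linear MDPs (Jin et al., 2020b), linear completeness (Zanette et al., 2020b), and low-rank MDPs with unknown representation (Modi et al., 2021). Our analyses indicate that the explorability or reachability assumptions previously made for the latter two settings are not statistically necessary for reward-free exploration. On the negative side, we provide a statistical hardness result for both reward-free and reward-aware exploration under linear completeness assumptions when the underlying features are unknown, showing an exponential separation between the low-rank and linear completeness settings.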

MoDi: Unconditional Motion Synthesis from Diverse Data

Sigal Raab, Inbal Leibovitch, Peizhuo Li, Kfir Aberman, Olga Sorkine-Hornung, Daniel Cohen-Or

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2022-06-16

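The emergence of neural networks has revolutionized the field of motion synthesis. Yet learning to unconditionally synthesize motions from a given distribution remains a challenging task, especially when the motions are highly diverse. We present MoDi, an unconditional generative model that synthesizes diverse motions. Our model is trained in a completely unsupervised setting from a diverse, unstructured, and unlabeled motion dataset, and yields a well-behaved, highly semantic latent space. The design of our model follows the prolific architecture of StyleGAN and adapts two of its key technical components to the motion domain: a set of style codes injected into each level of the generator hierarchy, and a mapping function that learns and forms a disentangled latent space. We show that despite the lack of any structure in the dataset, the latent space can be semantically clustered and facilitates semantic editing and motion interpolation. In addition, we propose a technique to invert unseen motions into the latent space, and demonstrate latent-based motion editing operations that cannot otherwise be achieved by naive manipulation of explicit motion representations. Our qualitative and quantitative experiments show that our framework achieves state-of-the-art synthesis quality and can follow the distribution of highly diverse motion datasets. Code and trained models will be released at https://sigal-raab.github.io/modi.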
